Skip to content

ENH: Create script to set up a development environment using docker #39936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

gustavocmaciel
Copy link
Contributor

This script will help the contributors with the process of setting up a docker container. So instead of manually passing the Github username in the Dockerfile and then running a couple of commands, you can run only this:

./docker.sh

...and enter the Github username when prompted.

I think this can be good, right? Especially for the new users.

@jreback jreback added the Docs label Feb 21, 2021
@jreback
Copy link
Contributor

jreback commented Feb 21, 2021

I suppose this is ok, we don't normally recommend using docker at all, but to the extent people use it, having a script is good.

cc @simonjayhawkins @WillAyd

@simonjayhawkins
Copy link
Member

will look soon, can docker build args not be used? or username passed to docker run?

@MarcoGorelli
Copy link
Member

Given that the Dockerfile exists, this seems like a good idea to me - though perhaps this would be better placed in scripts/?

@gustavocmaciel
Copy link
Contributor Author

can docker build args not be used?

Yeah I guess so. That would avoid the sed part.

though perhaps this would be better placed in scripts/?

Sure, but I have to make some changes to make it work there because the docker run command will mount the current working directory into the container, in that case would be the scripts/ dir but we want to mount the root of the project. And the Dockerfile would be one dir up, so the docker build command should be updated.

@gustavocmaciel
Copy link
Contributor Author

the docker run command will mount the current working directory into the container, in that case would be the scripts/ dir but we want to mount the root of the project

Actually there's no need to change this, you can just run from the root of the project.

@gustavocmaciel
Copy link
Contributor Author

I modified the docker build command to pass the Github username as the build argument, and I moved the script to the scripts/ directory (and updated the doc instruction).

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Apr 11, 2021
@simonjayhawkins
Copy link
Member

@gcmaciel There has been some changes to the documentation. Can you merge master and resolve conflicts.

@gustavocmaciel
Copy link
Contributor Author

Can you merge master and resolve conflicts

@simonjayhawkins Done.

@lithomas1
Copy link
Member

@simonjayhawkins @MarcoGorelli Can you guys take a look? Thanks.

@MarcoGorelli
Copy link
Member

MarcoGorelli commented May 11, 2021

@simonjayhawkins @MarcoGorelli Can you guys take a look? Thanks.

Hey - sorry for the delay, and thanks for the ping! I'll try running this next week and'll let you know

EDIT: sorry, haven't had a chance to do this yet

@MarcoGorelli MarcoGorelli self-requested a review May 11, 2021 08:08
@datapythonista
Copy link
Member

Personally I don't see how this is making things easier. To me it makes them more complicated. Running two docker commands seem reasonable, at least the user knows what is doing, and what's going on. Replacing this by a script, and losing visibility of what's going on, doesn't seem an improvement, even if instead of having to copy two commands from the doc, only one is needed.

@gustavocmaciel
Copy link
Contributor Author

gustavocmaciel commented Jun 3, 2021

@datapythonista I understand your point. For someone who's already used to docker, manually running those commands doesn't seem a big deal, but I've seen people here struggling to set up the docker environment and the trickiest part is the one you have to pass your username in the Dockerfile, that's not so intuitive and someone can easily forget to revert the change and/or commit the Dockerfile with his name on it. This script avoids that and makes it easier (in my opinion) to run the container because it replaces a long command that can be hard to remember (the script can be used not only to set up the environment but also to run the container every time you want to work on the project).

Of course I'm suspicious to say because this was my idea but I still think this is good. I've been using this since then and I'll keep using it because I found it better this way.

I don't know if I was clear enough but I hope the best decision to be made (even if the best decision would be not approving this).

@datapythonista
Copy link
Member

and someone can easily forget to revert the change and/or commit the Dockerfile with his name on it

That's not correct. The ARG directive is not meant to be edited in the dockerfile, but as a parameter, for example: --build-arg gh_username=datapythonista

Users who don't know what they are doing they'll simply copy whatever we say in the docs they need to run. This is just a shortcut, so instead of two commands, they need to copy one. To me, not worth having to maintain an extra script in a project already too complex like pandas. Having to support users who will start using a new script...

I think pandas main challenge is to keep things simple, not make it trivial to contribute (with this or without this script, contributing to pandas is not a trivial thing). So, personally, we're looking for trouble with shortcuts like this, trying to solve a problem in my opinion we don't have. Not even sure if it's a good idea to have a dockerfile to contribute to pandas to be honest.

@gustavocmaciel
Copy link
Contributor Author

The ARG directive is not meant to be edited in the dockerfile, but as a parameter, for example: --build-arg gh_username=datapythonista

@datapythonista Yes, that's what the script does:

pandas/scripts/docker.sh

Lines 19 to 21 in 09d75f9

# Pass the Github username as the build variable
read -rp "Github username: " gh_username
docker build --tag pandas-"$USER"-env --build-arg gh_username=$gh_username .

contributing to pandas is not a trivial thing

I agree, maybe the best advice for someone who is starting would be "spend the day reading the docs".

Not even sure if it's a good idea to have a dockerfile to contribute to pandas to be honest.

Yeah I've heard this before. For me it was good because I was already used to docker, but I'm not sure if this is the best way or not.

Anyway, if you think this will add even more complexity to the project, all I have to do is agree with you. You know this project much more than me so I'm sure you know what you are saying. I'm here to learn more than anything.

@mroeschke
Copy link
Member

Thanks again for the contribution. Since there hasn't been a strong champion for this enhancement from the team (IMO I think running the Docker commands explicitly is better as well) and the discussions have stalled, going to close this PR. Can reopen if another team member feels strongly about this enhancement.

@mroeschke mroeschke closed this Sep 8, 2021
@simonjayhawkins
Copy link
Member

Can reopen if another team member feels strongly about this enhancement.

I'm not sure that our current DockerFile is appropriate anyway if we use GitHub codespaces going forward.

(I have changed this locally (and tested with Visual Studio Code Remote - Containers but can't test with GitHub Codespaces as I am on the waitlist for codespaces. I believe pandas does have codespaces access since I see this message... "conda-forge and pandas-dev have Codespaces access but have not enabled it for your account. Contact your organization admin to enable access."

I could maybe try this on a PR if I had access but have been waiting for codespaces on my fork to experiment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants